AB - JP2000305579 NOVELTY - A reference signal is extracted from a
communication channel, and the power of its audio component and noise
component are estimated in power and noise estimation units (1, 2),
respectively. Based on the estimated noise power, the reference signal
is judged to be an audio signal or not. The assessed value for noise power
estimation in the noise power estimation unit is updated based on
the judgment result.

- USE - For intercoms and telephones used in residences, offices, and factories.

- ADVANTAGE - By switching the updating of the assessed value for noise
power estimation, the background noise power estimation process is
suspended when the acoustic coupling component included in the
reference signal is large, and hence the presence of an audio component
in the reference signal can be accurately detected.

- DESCRIPTION OF DRAWING(S) - The figure shows the block
diagram of the amplifying call machine.

- Noise estimation units 1, 2
TI - SPEECH DETECTING DEVICE

AB - PROBLEM TO BE SOLVED: To precisely detect whether a
reference signal contains a speech signal or not in a system
presenting a large acoustic coupling gain or under a condition with
a high background noise level.
- SOLUTION: A background noise power estimating and switching
part 4 stops the estimation processing in a background noise power
estimating part 2 when the ratio of an acoustic coupling gain
component contained in a reference signal Vx is large. This makes
a background noise power estimation value Pn close to a
value approximating the background noise power around a
microphone 10 regardless of the phone call state. Thus, this device
reduces the deterioration of speech detection performance caused
by the turn-around of background noise on the far side to the
microphone 10 in a system with the microphone 10 and a speaker
11 at a short distance causing a large acoustic coupling gain, and
precisely detects whether a reference signal contains a speech
signal or not under a condition with a high background noise level.
Japan Patent Office is not responsible for any
damages caused by the use of this translation. 

1. This document has been translated by computer. So the translation may not reflect the original 
precisely. 

2. **** shows the word which can not be translated. 
3. In the drawings, any words are not translated. 



DETAILED DESCRIPTION 



[Detailed Description of the Invention]

[0001]

[The technical field to which invention belongs] This invention relates to the voice detector for
carrying a normal-mode-rejection function, a voice [illegible] function, etc. in the speaking circuit
of the **** telephone call [illegible] office, works, etc.
[0002]

[Description of the Prior Art] Generally, a voice detector is used in order to detect whether the
acoustic signal collected with the microphone contains a sound signal. A typical example of the
composition of such a voice detector is shown in drawing 5. This voice detector VD' is equipped
with the instant power presumption section 20, the background-noise power presumption section 21,
and the comparison-test section 22. The instant power presumption section 20 is realized by an
integrating circuit, a digital filter, etc. with the property that its rise is steep and its fall is
gentle, and presumes the short-time average power of a reference signal (the acoustic signal
collected with a microphone). Moreover, the background-noise power presumption section 21 is
realized by an integrating circuit, a digital filter, etc. with the property that its rise is gentle
and its fall is steep, and presumes the background-noise level which exists steadily in the
reference signal. Furthermore, by comparing with a predetermined threshold the ratio of the instant
power estimate calculated by the instant power presumption section 20 to the background-noise power
estimate calculated by the background-noise power presumption section 21, the comparison-test
section 22 judges (detects) whether the reference signal contains a sound signal, and outputs a
binary signal (detecting signal) of H or L.
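
As a rough illustration of the conventional composition just described, the following Python sketch
tracks the instant power with a steep rise and gentle fall, tracks the background-noise power with a
gentle rise and steep fall, and compares their ratio with a threshold. It is only a sketch under
assumptions: the one-pole smoothing form, every coefficient, the threshold value, and the names
`asymmetric_smooth` and `detect_conventional` do not appear in this document.

```python
def asymmetric_smooth(prev, x, rise, fall):
    """One-pole smoother with separate coefficients for rising and falling input.

    A coefficient near 1 follows the input quickly (steep); near 0, slowly (gentle).
    """
    coeff = rise if x > prev else fall
    return prev + coeff * (x - prev)


def detect_conventional(samples, threshold=4.0, eps=1e-12):
    """Return one H/L flag (True/False) per sample, in the manner of sections 20, 21 and 22."""
    ps = 0.0    # instant power estimate (section 20): steep rise, gentle fall
    pn = None   # background-noise power estimate (section 21): gentle rise, steep fall
    flags = []
    for s in samples:
        power = s * s
        ps = asymmetric_smooth(ps, power, rise=0.5, fall=0.01)
        if pn is None:
            pn = power  # assumed start-up rule: seed the noise estimate with the first sample
        else:
            pn = asymmetric_smooth(pn, power, rise=0.001, fall=0.5)
        # Comparison-test section 22: ratio of the two estimates against the threshold.
        flags.append(ps / (pn + eps) > threshold)
    return flags


# Steady low-level noise followed by a louder burst.
flags = detect_conventional([0.1] * 200 + [1.0] * 20)
```

With these assumed values the flag stays L during the steady noise and turns H during the burst,
because the instant power estimate follows the burst quickly while the noise estimate barely moves.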
[0003] 

[Problem(s) to be Solved by the Invention] However, when preparing the above voice detector VD' in
the internal circuitry of a **** telephone call machine (not shown) which has a loudspeaker and a
microphone, the wraparound component (acoustic turnover component) from the loudspeaker is
contained in the acoustic signal collected with the microphone. When the rate of the wraparound
component contained in this acoustic signal is large, it becomes difficult to detect whether the
speaker who is near the microphone uttered voice, which is the original purpose. For example, when
the background-noise level near the telephone call terminal on the far end side is large and the
microphone collects the far-end background noise through the acoustic turnover, the
background-noise power estimate calculated by the background-noise power presumption section 21
in the above-mentioned conventional example becomes large. Consequently, even in the state where
the speaker who is near the microphone uttered voice, the ratio of the instant power estimate to the
background-noise power estimate is small and cannot exceed the predetermined threshold, and there is
a possibility that the comparison-test section 22 may incorrectly detect the signal as not being a
sound signal (non-voice).

[0004] This invention was made in view of the above-mentioned problem, and its purpose is to
offer a voice detector which can detect with sufficient precision whether a sound signal is
contained in the reference signal under the situation that the background-noise level is large.
[0005] 

[Means for Solving the Problem] In order to attain the above-mentioned purpose, the invention of
claim 1 is a voice detector which is used for the **** telephone call terminal of a **** telephone
call system, in which a **** telephone call terminal having a microphone and a loudspeaker is
connected to other telephone call terminals or **** telephone call terminals and performs a
half-duplex telephone call, and which detects whether the signal transmitted to a channel is a sound
signal or a non-voice signal. It is characterized by having: the instant power presumption section
which presumes the instant power of the reference signal taken out from the above-mentioned channel;
the background-noise power presumption section which presumes the power of the background-noise
component contained in the above-mentioned reference signal; the 1st voice / non-voice judging
section which judges whether the reference signal is a sound signal or a non-voice signal based on
the instant power estimate presumed in the above-mentioned instant power presumption section and the
background-noise power estimate presumed in the above-mentioned background-noise power presumption
section, and which holds the last judgment result until the judgment result is updated; and the
background-noise power presumption change section which changes updating/halt of the
background-noise power estimate in the above-mentioned background-noise power presumption section.
When the rate of the acoustic turnover component contained in the reference signal is large, the
background-noise power presumption change section suspends the processing of the background-noise
power presumption section, and the judgment by the 1st voice / non-voice judging section is
performed based on the background-noise power estimate which the background-noise power presumption
section obtained and held in the situation that the rate of the acoustic turnover component
contained in the reference signal was small. This reduces the situation in which, when the
background noise in the telephone call terminal on the far end side is sent out from the loudspeaker
and turns to the microphone, the ratio of the voice component emitted by the near end side speaker
to the other background-noise component in the acoustic signal collected by the microphone becomes
small, and a sound signal cannot be detected even though it is contained in the reference signal.
Therefore, it is detectable with sufficient precision whether the reference signal is a sound signal
under the situation that the background-noise level is large.

[0006] In the invention of claim 1, the invention of claim 2 has a voice non-detecting section
time-check section, which finds the voice non-detecting duration in which the reference signal is
detected as a non-voice signal according to the judgment result of the above-mentioned 1st voice /
non-voice judging section, and the 2nd voice / non-voice judging section, which judges whether the
above-mentioned reference signal is a sound signal or a non-voice signal from the voice
non-detecting duration found by this voice non-detecting section time-check section. It is
characterized in that, when the voice non-detecting duration found by the above-mentioned voice
non-detecting section time-check section is approximately constant over a time of about the phoneme
duration of the human voice, and is about the pitch period of the human voice, this 2nd voice /
non-voice judging section judges all the reference signals of this voice non-detecting duration to
be a sound signal. Even when a sound signal is not detected in the 1st voice / non-voice judging
section under the situation that the background-noise level is very large, if the voice
non-detecting duration measured in the voice non-detecting section time-check section is almost
constant over about an audio phoneme duration and almost equal to the pitch interval of the human
voice, the reference signal is anew judged as a sound signal in the 2nd voice / non-voice judging
section, so a sound signal can be detected with still more sufficient precision in a **** telephone
call system with a large background-noise level.

[0007]

[Embodiments of the Invention] (Operation gestalt 1) Drawing 1 is the block diagram showing the
**** telephone call machine M which has the voice detector VD1 in the operation gestalt 1 of this
invention. This **** telephone call machine M is equipped with a microphone 10, a loudspeaker 11,
the microphone amplifier 15, the loudspeaker amplifier 16, the voice detector VD1, and voice switch
VS, and is connected with other **** telephone call machines etc. through a circuit. Here, voice
switch VS suppresses howling by reducing the gain of the closed loop formed by the acoustic turnover
from the loudspeaker 11 to the microphone 10 and the wraparound on the circuit side. It has the
transmission side attenuator 12 inserted on the transmission signal line for transmitting the
sound signal (transmission signal) collected with the microphone 10 to the circuit, the receiver
side attenuator 13 inserted on the receiver signal line for transmitting the sound signal (receiver
signal) received from the circuit to the loudspeaker 11, and the amount control section 14 of
insertion losses which controls the gain of the transmission side attenuator 12 and the receiver
side attenuator 13 according to a talk state. In the amount control section 14 of insertion
losses, the transmission and reception talk signals are observed, a talk state is judged, and the
gain of the transmission side attenuator 12 and the gain of the receiver side attenuator 13 are
appropriately set up according to the talk state.
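
The gain setting of voice switch VS can be pictured with the short Python sketch below. The text
only says that the two gains are set up appropriately according to the talk state, so the rule used
here (full gain on the active side, a fixed insertion loss on the other side), the 20 dB figure, the
state labels, and the name `attenuator_gains` are all assumptions made for illustration.

```python
def attenuator_gains(talk_state, loss_db=20.0):
    """Return (transmission side gain, receiver side gain) as linear amplitude factors.

    Assumed rule: the side in use passes the signal unchanged and the other side
    receives the whole insertion loss, so the closed-loop gain is always reduced.
    """
    loss = 10.0 ** (-loss_db / 20.0)  # convert a loss in dB to a linear factor
    if talk_state == "transmission":
        return 1.0, loss   # attenuator 12 open, attenuator 13 inserts the loss
    if talk_state == "receiver":
        return loss, 1.0   # attenuator 12 inserts the loss, attenuator 13 open
    raise ValueError(f"unknown talk state: {talk_state!r}")


tx_gain, rx_gain = attenuator_gains("transmission")
```

Whatever values are actually used, the point of the arrangement is that the product of the two
gains stays well below 1, which is what keeps the loop from howling.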

[0008] The voice detector VD1 concerning this invention has: the instant power presumption section 1
which presumes the instant power of the reference signal (transmission signal) Vx taken out from the
channel (transmission signal line); the background-noise power presumption section 2 which presumes
the power of the background-noise component contained in the reference signal Vx; the 1st voice /
non-voice judging section 3 which judges whether the reference signal Vx is a sound signal or a
non-voice signal based on the instant power estimate Ps presumed in the instant power presumption
section 1 and the background-noise power estimate Pn presumed in the background-noise power
presumption section 2, and which holds the last judgment result until the judgment result is
updated; and the background-noise power presumption change section 4 which changes updating/halt of
the background-noise power estimate Pn in the background-noise power presumption section 2.

[0009] The instant power presumption section 1 is constituted by an integrating circuit, a digital
filter, etc. whose rise is steep and whose fall is gentle. Moreover, the background-noise power
presumption section 2 is constituted by an integrating circuit or digital filter whose rise is
gentle and whose fall is steep.

[0010] On the other hand, as shown in drawing 2, the 1st voice / non-voice judging section 3 is
constituted by: the comparator CP1 which compares the instant power estimate Ps outputted from the
instant power presumption section 1 with the predetermined threshold Ps0 and outputs the binary
signal D1 of H or L; divider 3a which calculates the ratio Ps/Pn of the instant power estimate Ps
to the background-noise power estimate Pn outputted from the background-noise power presumption
section 2; the comparator CP2 which compares the output value Ps/Pn of divider 3a with the
predetermined threshold delta and outputs the binary signal D2 of H or L; and AND-operation section
3b which calculates the AND of the two binary signals D1 and D2. In this operation gestalt, when the
instant power estimate Ps is larger than the threshold Ps0 (Ps>Ps0) and the output Ps/Pn of divider
3a is larger than threshold delta (Ps/Pn>delta), the reference signal is judged to be a sound
signal, and in other cases it is judged to be a non-voice signal. Here, the threshold Ps0 is a
threshold which specifies the minimum level of a sound signal, and threshold delta is a threshold
which specifies the minimum ratio of sound signal level to background-noise level.
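
The decision made by the 1st voice / non-voice judging section 3 amounts to two comparisons and an
AND, which the following Python sketch mirrors. The function name `judge_first_stage`, the handling
of a zero noise estimate, and the numeric values in the usage lines are assumptions for
illustration; only the structure (CP1, divider 3a, CP2, AND-operation section 3b) comes from the
text.

```python
def judge_first_stage(ps, pn, ps0, delta):
    """Return True (H, sound signal) or False (L, non-voice signal)."""
    d1 = ps > ps0                                  # comparator CP1: Ps > Ps0
    ratio = ps / pn if pn > 0 else float("inf")    # divider 3a: Ps/Pn (zero guard is assumed)
    d2 = ratio > delta                             # comparator CP2: Ps/Pn > delta
    return d1 and d2                               # AND-operation section 3b


loud_over_quiet_noise = judge_first_stage(ps=1.0, pn=0.1, ps0=0.01, delta=4.0)
too_quiet = judge_first_stage(ps=0.005, pn=0.0001, ps0=0.01, delta=4.0)
buried_in_noise = judge_first_stage(ps=1.0, pn=0.5, ps0=0.01, delta=4.0)
```

The second case fails on the minimum-level test even though its ratio is large, and the third fails
on the ratio test even though its level is high, which is why both comparators are needed.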

[0011] The background-noise power presumption change section 4 consists of a switch which is turned
on and off by the control signal Vs outputted from the amount control section 14 of insertion losses
of voice switch VS, and which turns on / off the input of the reference signal Vx to the
background-noise power presumption section 2. The background-noise power presumption section 2
serves as the update mode when the background-noise power presumption change section 4 turns on and
the reference signal Vx is inputted, and serves as the halt mode when the background-noise power
presumption change section 4 turns off and the reference signal Vx is not inputted. Here, in the
update mode, the background-noise power presumption section 2 updates the background-noise power
estimate Pn serially with reference to the reference signal Vx. Moreover, in the halt mode, the
background-noise power presumption section 2 suspends the above-mentioned data processing and holds
the value calculated before it as the background-noise power estimate Pn.
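
The two modes of the background-noise power presumption section 2 can be sketched in Python as
follows. This is a minimal sketch under assumptions: the class name `BackgroundNoiseEstimator`, the
one-pole update with a gentle rise and steep fall, and the coefficient values are illustrative and
are not given in this document.

```python
class BackgroundNoiseEstimator:
    """Background-noise power estimate Pn with an update mode and a halt mode."""

    def __init__(self, initial=0.0, rise=0.001, fall=0.5):
        self.pn = initial
        self.rise = rise   # gentle rise (assumed coefficient)
        self.fall = fall   # steep fall (assumed coefficient)

    def step(self, power, switch_on):
        """Process one power sample; switch_on plays the role of change section 4."""
        if switch_on:
            # Update mode: Pn follows the reference signal power.
            coeff = self.rise if power > self.pn else self.fall
            self.pn += coeff * (power - self.pn)
        # Halt mode: nothing is computed and the previous Pn is held.
        return self.pn


est = BackgroundNoiseEstimator(initial=0.01)
held = est.step(1.0, switch_on=False)      # halt mode: Pn is unchanged
updated = est.step(1.0, switch_on=True)    # update mode: Pn moves toward the input
held_again = est.step(5.0, switch_on=False)
```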




[0012] Here, the amount control section 14 of insertion losses of voice switch VS turns off the
background-noise power presumption change section 4 with the control signal Vs when the talk state
is judged to be a receiver state, and turns on the background-noise power presumption change section
4 with the control signal Vs when the talk state is judged to be a transmission state. Therefore,
the background-noise power presumption section 2 becomes the halt mode in a receiver state and the
update mode in a transmission state. In the voice detector VD1, by suspending the presumption
processing of the background-noise power presumption section 2 when the rate of the acoustic
turnover component contained in the reference signal Vx is large, the background-noise power
estimate Pn can be made a value which approximates the background-noise power around the microphone
10 without depending on the talk state. Consequently, even in a system where the distance between
the microphone 10 and the loudspeaker 11 is short and the acoustic turnover gain is large,
degradation of the voice detection performance caused by the background noise on the far end side
turning to the microphone 10 can be reduced. In addition, the detecting signal (detection flag) SD1
of the voice detector VD1 is given to, for example, voice switch VS, and is used for various control.
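
The effect of halting the estimate in the receiver state can be checked with a small numerical
experiment in Python. Everything numeric here is made up for illustration: the power levels, the
smoothing coefficients, the number of frames, and the names `TalkState`, `control_signal` and
`track_noise` are assumptions, and the noise tracker is a simplified stand-in for section 2.

```python
from enum import Enum


class TalkState(Enum):
    TRANSMISSION = "transmission"
    RECEIVER = "receiver"


def control_signal(talk_state):
    """Vs: True (change section 4 on, update mode) only in the transmission state."""
    return talk_state is TalkState.TRANSMISSION


def track_noise(frames, pn=0.01, rise=0.05, fall=0.5, gated=True):
    """Run the noise estimate over (power, talk_state) frames and return the final Pn."""
    for power, talk_state in frames:
        if gated and not control_signal(talk_state):
            continue  # halt mode: hold Pn while the far-end signal wraps around
        coeff = rise if power > pn else fall
        pn += coeff * (power - pn)
    return pn


# Far-end background noise wrapping around to the microphone during a receiver state.
far_end_noise = [(0.5, TalkState.RECEIVER)] * 100
pn_gated = track_noise(far_end_noise)
pn_ungated = track_noise(far_end_noise, gated=False)

# Ratio Ps/Pn seen by the judging section when near-end speech of power 1.0 then arrives.
ratio_gated = 1.0 / pn_gated
ratio_ungated = 1.0 / pn_ungated
```

With these made-up numbers the ratio stays at about 100 when the estimate is halted and falls to
about 2 when it is not, so a threshold delta of, say, 4 would be cleared only in the halted case.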

[0013] As mentioned above, according to the voice detector VD1 applied to this invention, when the
rate of the acoustic turnover component contained in the reference signal Vx is large, the
background-noise power presumption change section 4 suspends the processing of the background-noise
power presumption section 2, and the judgment by the 1st voice / non-voice judging section 3 is
performed based on the background-noise power estimate Pn which the background-noise power
presumption section 2 obtained and held in the situation that the rate of the acoustic turnover
component contained in the reference signal Vx was small. This reduces the situation in which, when
the background noise in the telephone call terminal on the far end side is sent out from the
loudspeaker 11 and turns to the microphone 10, the ratio of the voice component emitted by the near
end side speaker to the other background-noise component in the acoustic signal collected by the
microphone 10 becomes small, and a sound signal cannot be detected even though it is contained in
the reference signal Vx. Even in a system with a short distance from the loudspeaker 11 to the
microphone 10 and a large acoustic turnover gain, it is detectable with sufficient precision whether
the reference signal Vx is a sound signal.
[0014] (Operation gestalt 2) Drawing 3 shows the block diagram of the voice detector VD2 in the
operation gestalt 2 of this invention. However, since the fundamental composition of this operation
gestalt is common to the operation gestalt 1, the same sign is given to common composition and
explanation is omitted.

[0015] This operation gestalt is characterized in that it is equipped with the voice non-detecting
section time-check section 5, which finds, based on the detecting signal SD1 of the 1st voice /
non-voice judging section 3, the times tau1, tau2, ... in which the reference signal Vx is judged to
be a non-voice signal as shown in drawing 4, i.e., the times for which the detecting signal SD1 is
at L level (henceforth "voice non-detecting duration"), and with the 2nd voice / non-voice judging
section 6, which judges whether the reference signal Vx is a sound signal or a non-voice signal
based on the voice non-detecting durations tau1, tau2, ... found by the voice non-detecting section
time-check section 5.

[0016] Here, in the voice non-detecting section time-check section 5, the time-check processing is
reset whenever the detecting signal SD1 changes from L to H, but the time-check results before it
(voice non-detecting durations tau1, ...) are held in storage means, such as RAM, at least for
about the phoneme duration of a sound signal.

[0017] Moreover, the 2nd voice / non-voice judging section 6 refers to the voice non-detecting
durations tau1, tau2, ..., tauN memorized by the storage means of the voice non-detecting section
time-check section 5. When these values are approximately constant over a time of about the phoneme
duration of the human voice and are about the pitch interval of the human voice, these sections
tau1 - tauN are anew detected as the voice section, and the detecting signal SD2 of H level is
outputted (refer to drawing 4).
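
The time check and the second judgment can be sketched in Python as below. The document gives no
numeric criteria, so the pitch range of 2 to 20 ms, the 50 ms phoneme span, the 20 % tolerance for
"approximately constant", the use of the summed L time as the observed span, and the function names
are all assumptions chosen only to make the idea concrete.

```python
def non_detect_durations(sd1, frame_ms):
    """Durations in ms of each L run in SD1 that is ended by an L-to-H change."""
    durations, run = [], 0
    for flag in sd1:
        if flag:
            if run:
                durations.append(run * frame_ms)  # store the finished time-check result
            run = 0                               # reset on the change from L to H
        else:
            run += 1
    return durations


def judge_second_stage(durations, pitch_range_ms=(2.0, 20.0), phoneme_ms=50.0, tolerance=0.2):
    """Return True (SD2 = H) when the stored durations look like pitch-period gaps."""
    if not durations:
        return False
    mean = sum(durations) / len(durations)
    low, high = pitch_range_ms
    about_pitch_period = all(low <= d <= high for d in durations)
    nearly_constant = all(abs(d - mean) <= tolerance * mean for d in durations)
    spans_a_phoneme = sum(durations) >= phoneme_ms  # assumed proxy for the observed time
    return about_pitch_period and nearly_constant and spans_a_phoneme


# SD1 sampled every 1 ms: eight L runs of 8 ms, each ended by a short H pulse.
sd1 = [True] + ([False] * 8 + [True] * 2) * 8
durations = non_detect_durations(sd1, frame_ms=1.0)
sd2 = judge_second_stage(durations)
```

Irregular gaps, or gaps much longer than a pitch period, fail one of the three tests and leave SD2
at L, so ordinary pauses are not re-labelled as voice.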

[0018] According to this operation gestalt, as shown in drawing 4, even when the ambient noise level
VN in the neighborhood of the microphone 10 is high, the ratio Ps/Pn of the instant power estimate
Ps to the background-noise power estimate Pn is small, and the reference signal Vx is therefore
judged as a non-voice signal in the 1st voice / non-voice judging section 3 in spite of containing
voice, it becomes possible to detect it anew as a sound signal in the 2nd voice / non-voice judging
section 6. Consequently, compared with the operation gestalt 1, there is an advantage that a sound
signal can be detected with still more sufficient precision also in a **** telephone call system
with a large background-noise level.

[0019] 
[Effect of the Invention] The invention of claim 1 is a voice detector which is used for the ****
telephone call terminal of a **** telephone call system, in which a **** telephone call terminal
having a microphone and a loudspeaker is connected to other telephone call terminals or ****
telephone call terminals and performs a half-duplex telephone call, and which detects whether the
signal transmitted to a channel is a sound signal or a non-voice signal. It has: the instant power
presumption section which presumes the instant power of the reference signal taken out from the
above-mentioned channel; the background-noise power presumption section which presumes the power of
the background-noise component contained in the above-mentioned reference signal; the 1st voice /
non-voice judging section which judges whether the reference signal is a sound signal or a non-voice
signal based on the instant power estimate presumed in the above-mentioned instant power presumption
section and the background-noise power estimate presumed in the above-mentioned background-noise
power presumption section, and which holds the last judgment result until the judgment result is
updated; and the background-noise power presumption change section which changes updating/halt of
the background-noise power estimate in the above-mentioned background-noise power presumption
section. Therefore, when the rate of the acoustic turnover component contained in the reference
signal is large, the background-noise power presumption change section suspends the processing of
the background-noise power presumption section, and the judgment by the 1st voice / non-voice
judging section is performed based on the background-noise power estimate which the background-noise
power presumption section obtained and held in the situation that the rate of the acoustic turnover
component contained in the reference signal was small. This reduces the situation in which, when the
background noise in the telephone call terminal on the far end side is sent out from the loudspeaker
and turns to the microphone, the ratio of the voice component emitted by the near end side speaker
to the other background-noise component in the acoustic signal collected by the microphone becomes
small, and a sound signal cannot be detected even though it is contained in the reference signal.
There is thus the effect that it is detectable with sufficient precision whether the reference
signal is a sound signal under the situation that the background-noise level is large.
[0020] The invention of claim 2 has a voice non-detecting section time-check section, which finds
the voice non-detecting duration in which the reference signal is detected as a non-voice signal
according to the judgment result of the above-mentioned 1st voice / non-voice judging section, and
the 2nd voice / non-voice judging section, which judges whether the above-mentioned reference signal
is a sound signal or a non-voice signal from the voice non-detecting duration found by this voice
non-detecting section time-check section. When the voice non-detecting duration found by the
above-mentioned voice non-detecting section time-check section is approximately constant over a time
of about the phoneme duration of the human voice, and is about the pitch period of the human voice,
this 2nd voice / non-voice judging section judges all the reference signals of this voice
non-detecting duration to be a sound signal. Therefore, even when a sound signal is not detected in
the 1st voice / non-voice judging section under the situation that the background-noise level is
very large, if the voice non-detecting duration measured in the voice non-detecting section
time-check section is almost constant over about an audio phoneme duration and almost equal to the
pitch interval of the human voice, the reference signal is anew judged as a sound signal in the 2nd
voice / non-voice judging section. This is effective in detecting a sound signal with still more
sufficient precision in a **** telephone call system with a large background-noise level.



[Translation done.] 